The Speech Manager

Contents:
Speech Manager Overview 2
Speech Manager Concepts 3
Using the Speech Manager 4
    Getting Started 4
        Determining If the Speech Manager Is Available 4
        Which Version of the Speech Manager Is Running? 5
        Making Some Noise 5
        Determining If Speaking Is Complete 6
        A Simple Example 6
    Essential Calls—Simple and Useful 7
        Working With Voices 7
        Managing Connections to Speech Synthesizers 11
        Starting and Stopping Speech 13
        Using Basic Speech Controls 14
        Putting It All Together 17
    Advanced Routines 18
        Advanced Speech Controls 19
        Converting Text Into Phonemes 23
        Getting Information About a Speech Channel 24
        Advanced Control Routines 30
    Application-Defined Pronunciation Dictionaries 36
        Associating a Dictionary With a Speech Channel 37
        Pronunciation Dictionary Data Format 38
        Creating and Editing Dictionaries 39
    Advanced Voice Information Routines 39
    Embedded Speech Commands 40
        Embedded Speech Command Syntax 41
        Embedded Speech Command Set 42
        Embedded Speech Command Error Reporting 45
Summary of Phonemes and Prosodic Controls 45
Summary of the Speech Manager 49

This document describes the Apple® Speech Manager, which provides a standardized method for Macintosh® applications to generate synthesized speech. The document provides an overview of the Speech Manager followed by general information about generating speech from text. The necessary information and calls needed by all text-to-speech applications are given next, followed by a simple example of speech generation. More advanced calls and special-purpose routines are described last.

Speech Manager Overview

A complete system for speech synthesis consists of the elements shown in Figure 1-1.

Figure 1-1 Speech synthesis components

An application calls routines in the Speech Manager to convert character strings into speech and to adjust various parameters that affect the quality or character of the spoken output. The Speech Manager is responsible for dispatching these requests to a speech synthesizer.
The speech synthesizer converts the text into sound and creates the actual audio output.

The Apple-supplied voices, pronunciation dictionaries, and speech synthesizer may reside in a single file or in separate files. These files are clearly identifiable as Speech Manager–related files and are installed and removed by being dragged into or out of the System Folder. Additional voices can be provided by bundling the resources in the resource forks of specific applications. These resources are considered private to that particular application. It is up to the individual developers to decide whether the voice resources they provide are usable on a systemwide basis or only from within their applications.

In the first release of the Speech Manager, pronunciation dictionaries are managed entirely by the application. The application is free to store dictionaries in either the resource or the data fork of a file. The application is responsible for loading the individual dictionaries into RAM and then passing a handle to the dictionary data to the Speech Manager.

Applications that use the Speech Manager must provide their own human interface for selecting voices and/or controlling other speech characteristics. If voices are provided in separate files, the speech synthesizer developer is responsible for providing a method for installing these resources into the System Folder or Extensions folder. The computer must be rebooted after speech synthesizers are added to or removed from the System Folder for the desired changes to be recognized.

Speech Manager Concepts

On a simple level, speech synthesis from text input is a two-stage process. First, plain-language English text is converted into phonemic representations for the individual words. Phonemes stand for specific sounds; for a complete explanation, see "Summary of Phonemes and Prosodic Controls," later in this document.
The resulting sequence of phonemes is converted into audible sounds by mapping the individual phonemes to a series of waveforms, which are sent to the sound hardware to be played.

In reality, each stage is more complicated than this description suggests. For example, during the text-to-phoneme conversion stage, number strings, abbreviations, and special symbols must be detected and converted into appropriate words before being converted into phonemes. When a sentence such as "He earned over $2,000,000 in 1990" is spoken, it would normally be preferable to say "He earned over two million dollars in nineteen-ninety" rather than "He earned over dollar-sign, two, comma, zero, zero, zero, comma, zero, zero, zero, in one, nine, nine, zero." To produce the desired spoken output automatically, knowledge of these sorts of constructions is built into the synthesizer.

The phoneme-to-sound conversion stage is also complex. Phonemes by themselves are often not sufficient to describe the way a word should be pronounced. For example, the word "object" is pronounced differently depending on whether it is used as a noun or a verb. (When it is used as a noun, the stress is placed on the first syllable. As a verb, the stress is placed on the second syllable.) In addition to stress information, phonemes must often be augmented with pitch, duration, and other information to produce intelligible, natural-sounding speech.

The speech synthesizer has many built-in rules for automatically converting text into the complex phonemic representation described above. However, there will always be words and phrases that are not pronounced the way you want. The Speech Manager allows you to provide raw phonemic information directly, enabling very precise control over the spoken output.

By default, speech synthesizers expect input in normal language text. However, using the input mode controls of the Speech Manager, you can tell the synthesizer to process input text in raw phonemic form.
By using the embedded commands described in the next section, you can even mix normal language text with phonemic text within a single string or text buffer. See "Summary of Phonemes and Prosodic Controls," later in this document, for a listing of the phonemic character set and each character's interpretation.

Using the Speech Manager

This section describes the routines used to add speech synthesis features to an application. It is organized into three sections: "Getting Started" (Easy), "Essential Calls—Simple and Useful" (Intermediate), and "Advanced Routines."

Getting Started

If you're just getting started with text-to-speech conversion using the Speech Manager, the following routines will get you up and running with minimal effort. If you're developing an application that does not need to choose voices, use more than one channel of speech, or exercise real-time control over the synthesized speech, these may be the only routines you need.

Determining If the Speech Manager Is Available

You can find out if the Speech Manager is available with a single call to the Gestalt Manager. Use the Gestalt toolbox routine and the selector gestaltSpeechAttr to determine whether or not the Speech Manager is available, as shown in Listing 1-1. If Gestalt returns noErr, the result parameter contains a 32-bit value indicating one or more attributes of the installed Speech Manager.
If the Speech Manager exists, the bit specified by gestaltSpeechMgrPresent is set.

Listing 1-1 Determining if the Speech Manager is available

Boolean SpeechAvailable (void) {
    OSErr   err;
    long    result;

    err = Gestalt(gestaltSpeechAttr, &result);
    if ((err != noErr) || !(result & (1 << gestaltSpeechMgrPresent)))
        return FALSE;
    else
        return TRUE;
}

Which Version of the Speech Manager Is Running?

Once you have determined that the Speech Manager is installed, you can see which version of the Speech Manager is running by calling SpeechManagerVersion.

SpeechManagerVersion

The SpeechManagerVersion function returns the version of the Speech Manager installed in the system.

pascal NumVersion SpeechManagerVersion (void);

DESCRIPTION
SpeechManagerVersion returns the version of the Speech Manager installed in the system. This call should be used to determine the compatibility of your program with the currently installed Speech Manager.

RESULT CODES
None

Making Some Noise

The most basic operation of the Speech Manager is accomplished by using the SpeakString call. This call passes a specific text string to be spoken to the Speech Manager.

SpeakString

The SpeakString function passes a specific text string to be spoken to the Speech Manager.

pascal OSErr SpeakString (StringPtr myString);

Field descriptions
myString        Text string to be spoken

DESCRIPTION
SpeakString attempts to speak the Pascal-style text string contained in myString. Speech is produced asynchronously using the default system voice. When an application calls this function, the Speech Manager makes a copy of the passed string and creates any structures required to speak it. As soon as speaking has begun, control is returned to the application. The synthesized speech is generated transparently to the application so that normal processing can continue while the text is being spoken.
No further interaction with the Speech Manager is required at this point, and the application is free to release or purge the original string.

If SpeakString is called while a prior string is still being spoken, the audio currently being synthesized is interrupted immediately. Conversion of the new text into speech is then initiated. If an empty (zero-length) string or a null string pointer is passed to SpeakString, it stops the synthesis of any prior string but does not generate any additional speech.

As with all Speech Manager routines that expect text arguments, the text may contain embedded speech control commands.

RESULT CODES
noErr               0       No error
memFullErr       –108       Not enough memory to speak
synthOpenFailed  –241       Could not open another speech synthesizer channel

Determining If Speaking Is Complete

Once an application starts a speech process with SpeakString, the next thing it will probably need to know is whether the string has been completely spoken. It can use SpeechBusy to determine whether or not the system is still speaking.

SpeechBusy

The SpeechBusy routine is useful when you want to ensure that an earlier speech request has been completed before having the system speak again.

pascal short SpeechBusy (void);

DESCRIPTION
SpeechBusy returns the number of channels of speech that are currently synthesizing speech in the application. If you use just SpeakString to initiate speech, SpeechBusy will always return 1 as long as speech is being produced. When SpeechBusy returns 0, all initiated speech has finished.

RESULT CODES
None

A Simple Example

The example shown in Listing 1-2 demonstrates how to use the routines introduced in this section. It first makes sure the Speech Manager is available. Then it starts speaking a string (hard-coded in this example, but more commonly loaded from a resource) and loops, doing some screen drawing, until the string is completely spoken.
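SpeakString expects a Pascal-style string: a length byte followed by the text, written as a "\p…" literal in the listings here. Compilers that lack the "\p" extension can build such a string at run time. The helper below is a sketch; the name CToPascalString is illustrative and is not a Toolbox routine, and Str255 is declared locally so the fragment stands alone.

```c
#include <string.h>

typedef unsigned char Str255[256];  /* length byte + up to 255 characters */

/* Copy a C string into a Pascal-style Str255, truncating at 255 bytes. */
void CToPascalString(const char *src, Str255 dst) {
    size_t len = strlen(src);
    if (len > 255)
        len = 255;
    dst[0] = (unsigned char)len;    /* Pascal length byte */
    memcpy(&dst[1], src, len);      /* text follows the length byte */
}
```

An application could then fill a Str255 from user-supplied text and pass it where a StringPtr is expected, since the array decays to a pointer to its length byte.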
This example uses the SpeechAvailable routine shown in Listing 1-1.

Listing 1-2 Elementary Speech Manager calls

OSErr   err;

if (SpeechAvailable()) {
    err = SpeakString("\pThe cat sat on the mat.");
    if (err == noErr)
        while (SpeechBusy() > 0)
            CoolAnimationRoutine();
    else
        NotSoCoolAlertRoutine(err);
}

Essential Calls—Simple and Useful

While the routines presented in the last section are simple to use, their applicability is limited to a few basic speech scenarios. This section describes additional routines that let you work with different voices and adjust some basic characteristics of the synthesized speech.

Working With Voices

When describing a person's voice, we talk about the particular set of characteristics that help us to distinguish that person's voice from another. For example, the rate at which one speaks (slow or fast) and the average pitch (high or low) characterize a particular speaker on a crude level. In the context of the Speech Manager, a voice is the set of parameters that specify a particular quality of synthesized speech. This portion of the Speech Manager is used to determine which voices are available and to select particular voices.

Every voice has a unique ID associated with it, which is the primary way an application refers to it. Within the Speech Manager, this unique voice ID is represented by a VoiceSpec structure.

The Speech Manager provides two routines to count and step through the list of currently available voices. CountVoices is used to compute how many voices are available with the current system. GetIndVoice uses an index, starting at 1, to return information about all currently installed voices.

Use the GetIndVoice routine to step through the list of available voices.
It will fill in a VoiceSpec record that can be used to obtain descriptive information about the voice or to speak using that voice.

Any application that wishes to use multiple voices will probably need additional information about the available voices beyond the VoiceSpec structure, such as the name of the voice and perhaps what script and language each voice supports. This information might be presented to the user in a "voice picker" dialog box or voice menu, or it might be used internally by an application trying to find a voice that meets certain criteria. Applications can use the GetVoiceDescription routine for these purposes.

MakeVoiceSpec

To maximize compatibility with future versions of the Speech Manager, you should always use MakeVoiceSpec instead of setting the fields of the VoiceSpec structure directly.

pascal OSErr MakeVoiceSpec (OSType creator, OSType id, VoiceSpec *voice);

typedef struct VoiceSpec {
    OSType  creator;    // determines which synthesizer is required
    OSType  id;         // voice ID on the specified synthesizer
} VoiceSpec;

Field descriptions
creator     The synthesizer required by your application
id          Identification number for this voice
*voice      Pointer to the VoiceSpec structure

DESCRIPTION
Most voice management routines expect to be passed a pointer to a VoiceSpec structure. MakeVoiceSpec is a utility routine provided to facilitate the creation of VoiceSpec records. On return, the passed VoiceSpec structure contains the appropriate values.

Voices are stored in resources of type 'ttsv' in the resource forks of Macintosh files. The Speech Manager uses the same search method as the Resource Manager, looking for voice resources in three different locations when attempting to resolve VoiceSpec references. It first looks in the application's resource file chain.
If the specified voice is not found in an open file, it then looks in the System Folder and the Extensions folder (or in just the System Folder under System 6) for files of type 'ttsv' (single-voice files) or 'ttsb' (multivoice files) and in text-to-speech synthesizer component files (file type 'INIT' or 'thng'). Voices stored in the System Folder or Extensions folder are normally available to all applications. Voices stored in the resource fork of an application file are private to that application.

RESULT CODES
noErr       0       No error

While the determination of specific voice ID values is mostly left to synthesizer developers, the voice creator values are specified by Apple (they would ordinarily correspond to a developer's currently assigned creator ID). For both the creator and id fields, Apple further reserves the set of OSType values composed entirely of space characters and lowercase letters. Apple is establishing a standard suite of voice ID values that developers can count on being available with all speech synthesizers.

CountVoices

The CountVoices routine returns the number of voices available.

pascal OSErr CountVoices (short *voiceCount);

Field descriptions
voiceCount      Number of voices available to the application

DESCRIPTION
Each time CountVoices is called, the Speech Manager searches for new voices. This algorithm supports dynamic installation of voices by applications or users. On return, the voiceCount parameter contains the number of voices available.

RESULT CODES
noErr       0       No error

GetIndVoice

The GetIndVoice routine returns information about a specific voice.

pascal OSErr GetIndVoice (short index, VoiceSpec *voice);

Field descriptions
index       Index value for a specific voice
*voice      Pointer to the VoiceSpec structure

DESCRIPTION
As with all other index-based routines in the Macintosh Toolbox, an index value of 1 causes GetIndVoice to return information for the first voice. The order in which voices are returned is not presently defined and should not be assumed.
Speech Manager behavior when voice files or resources are added, removed, or modified is also presently undefined. However, calling CountVoices or GetIndVoice with an index of 1 will force the Speech Manager to update its list of available voices. GetIndVoice will return a voiceNotFound error if the passed index value exceeds the number of available voices.

RESULT CODES
noErr               0       No error
voiceNotFound    –244       Voice resource not found

GetVoiceDescription

The GetVoiceDescription routine returns information about a voice beyond that provided by GetIndVoice.

pascal OSErr GetVoiceDescription (VoiceSpec *voice, VoiceDescription *info, long infoLength);

enum {kNeuter = 0, kMale, kFemale};     // returned in gender field below

typedef struct VoiceDescription {
    long        length;         // size of structure
    VoiceSpec   voice;          // synth and ID info for voice
    long        version;        // version code for voice
    Str63       name;           // name of voice
    Str255      comment;        // additional text info about voice
    short       gender;         // neuter, male, or female
    short       age;            // approximate age in years
    short       script;         // script code of text voice can process
    short       language;       // language code of voice output speech
    short       region;         // region code of voice output speech
    long        reserved[4];    // always zero - reserved
} VoiceDescription;

Field descriptions
*voice      Pointer to the VoiceSpec structure
*info       Pointer to structure containing parameters for the specified voice
infoLength  Length in bytes of info structure

DESCRIPTION
The Speech Manager fills out the passed VoiceDescription fields with the correct information for the specified voice. If a null VoiceSpec pointer is passed, the Speech Manager returns information for the system default voice. If the VoiceDescription pointer is null, the Speech Manager simply verifies that the specified VoiceSpec refers to an available voice.
If VoiceSpec does not refer to a known voice, GetVoiceDescription returns a voiceNotFound error, as shown in Listing 1-3.

To maximize compatibility with future versions of the Speech Manager, the application must pass the size of the VoiceDescription structure. Having the application do this ensures that the Speech Manager will never write more data into the passed structure than will fit, even if additional information fields are defined in the future. On returning from GetVoiceDescription, the length field is set to reflect the length of data actually written by this routine.

Listing 1-3 Getting information about a voice

OSErr GetVoiceGender (VoiceSpec *voicePtr, short *gender) {
    OSErr               err;
    VoiceDescription    vd;

    err = GetVoiceDescription(voicePtr, &vd, sizeof(VoiceDescription));
    if (err == noErr) {
        if (vd.length > offsetof(VoiceDescription, gender))
            *gender = vd.gender;
        else
            err = badStructLen;
    }
    return err;
}

RESULT CODES
noErr               0       No error
paramErr          –50       Parameter error
memFullErr       –108       Not enough memory to load voice into memory
voiceNotFound    –244       Voice resource not found

Managing Connections to Speech Synthesizers

Using the routines described earlier in this document, an application can select the voice with which to speak. The next step is to associate the selected voice with the proper speech synthesizer. This is accomplished by creating a new speech channel with the NewSpeechChannel routine.
A speech channel is a private communication connection to the speech synthesizer, much as a file reference number is a communication channel to an open file in the Macintosh file system.

The DisposeSpeechChannel routine closes a speech channel when the application is finished with it and releases any resources that have been allocated to support the speech synthesizer and are no longer needed.

NewSpeechChannel

The NewSpeechChannel routine creates a new speech channel.

pascal OSErr NewSpeechChannel (VoiceSpec *voice, SpeechChannel *chan);

Field descriptions
*voice      Pointer to the VoiceSpec structure
*chan       Pointer to the new channel

DESCRIPTION
The Speech Manager automatically locates and opens a connection to the proper synthesizer for a specified voice and sets up a channel at the location pointed to by chan so that it is ready to speak with that voice. If a null VoiceSpec pointer is passed to NewSpeechChannel, the Speech Manager uses the current system default voice.

There is no predefined limit to the number of speech channels an application may create. However, system constraints on available RAM, processor loading, and the number of available sound channels may limit the number of speech channels actually possible.

RESULT CODES
noErr               0       No error
memFullErr       –108       Not enough memory to open speech channel
synthOpenFailed  –241       Could not open another speech synthesizer channel
voiceNotFound    –244       Voice resource not found

DisposeSpeechChannel

The DisposeSpeechChannel routine disposes of an existing speech channel.

pascal OSErr DisposeSpeechChannel (SpeechChannel chan);

Field descriptions
chan        Specific speech channel

DESCRIPTION
This routine disposes of an existing speech channel.
Any speech channels that have not been explicitly disposed of by the application are released automatically by the Speech Manager when the application quits.

RESULT CODES
noErr                   0       No error
invalidComponentID  –3000       Invalid SpeechChannel parameter

Starting and Stopping Speech

All the remaining routines in this section require a valid speech channel to work properly. Once the application has successfully created a speech channel, it can start to speak. You use the SpeakText routine to begin speaking on a speech channel.

At any time during the speaking process, the application can stop the synthesizer's speech. The StopSpeech routine will immediately abort any speech being produced on the specified speech channel and force the channel back into an idle state.

SpeakText

The SpeakText routine converts a designated text into speech.

pascal OSErr SpeakText (SpeechChannel chan, Ptr textBuf, long byteLength);

Field descriptions
chan        Specific speech channel
textBuf     Buffer of text
byteLength  Length of textBuf

DESCRIPTION
In addition to a valid speech channel, SpeakText expects a pointer to the text to be spoken and the length in bytes of the text buffer. SpeakText will convert the given text stream into speech using the voice and control settings for that speech channel. The speech is generated asynchronously. This means that control is returned to your application before the speech has finished (probably even before it has begun). The maximum length of text buffer that can be spoken is limited only by the available RAM. However, it's generally not very friendly to force the user to listen to long uninterrupted text unless the user requests it. If SpeakText is called while it is currently busy speaking the contents of a prior text buffer, it will immediately stop speaking from the prior buffer and will begin speaking from the new text buffer as soon as possible.
As with SpeakString, described on page 5, if an empty (zero-length) string or a null text buffer pointer is passed to SpeakText, this will have the effect of stopping the synthesis of any prior text but will not generate any additional speech.

WARNING
With SpeakText, unlike with SpeakString, the text buffer must be locked in memory and must not move during the entire time that it is being converted into speech. This buffer is read at interrupt time, and very undesirable effects will happen if it moves or is purged from memory.

RESULT CODES
noErr                   0       No error
invalidComponentID  –3000       Invalid SpeechChannel parameter

StopSpeech

The StopSpeech routine terminates speech delivery on a specified channel.

pascal OSErr StopSpeech (SpeechChannel chan);

Field descriptions
chan        Specific speech channel

DESCRIPTION
After returning from StopSpeech, the application can safely release any text buffer that the speech synthesizer has been using. The SpeechBusy routine, described on page 6, can be used to determine if the text has been completely spoken. (In an environment with multiple speech channels, you may need to use the more advanced status routine GetSpeechInfo, described on page 25, to determine if a specific channel is still speaking.) StopSpeech can be called for an already idle channel without ill effect.

RESULT CODES
noErr                   0       No error
invalidComponentID  –3000       Invalid SpeechChannel parameter

Using Basic Speech Controls

The Speech Manager provides several methods of adjusting the variables that can affect the way speech is synthesized. Although most applications probably do not need to use these advanced features, two of the speech variables, speaking rate and speaking pitch, are useful enough that a very simple way of adjusting these parameters on a channel-by-channel basis is provided. Routines are supplied that enable an application to both set and get these parameters.
However, the audible effects of changing the rate and pitch of speech vary from synthesizer to synthesizer, so you should test the actual results on all synthesizers with which your application may work.

Speaking rates are specified in terms of words per minute (WPM). While this unit of measurement is difficult to define in any precise way, it is generally easy to understand and use. The range of supported rates is not predefined by the Speech Manager. Each speech synthesizer provides its own range of speaking rates. Furthermore, any specific rate value will correspond to slightly different rates with different synthesizers.

Speaking pitches are defined on a musical scale that corresponds to the keys on a standard piano keyboard. By convention, pitches are represented as fixed-point values in the range from 0.000 through 100.000, where 60.000 corresponds to middle C (261.625 Hz) on a conventional piano. Pitches are represented on a logarithmic scale. On this scale, a change of +12 units corresponds to doubling the frequency, while a change of –12 units corresponds to halving the frequency. For a further discussion of pitch values, see "Getting Information About a Speech Channel," later in this document.

Typical voice frequencies might range from around 90 hertz for a low-pitched male voice to perhaps 300 hertz for a high-pitched child's voice. These frequencies correspond to pitch values of 41.526 and 62.368, respectively.

Changes in speech rate and pitch are effective immediately (as soon as the synthesizer can respond), even if they occur in the middle of a word.

SetSpeechRate

The SetSpeechRate routine sets the speaking rate on a designated speech channel.

pascal OSErr SetSpeechRate (SpeechChannel chan, Fixed rate);

Field descriptions
chan        Specific speech channel
rate        Word output speaking rate

DESCRIPTION
The SetSpeechRate routine is used to adjust the speaking rate on a speech channel. The rate parameter is specified as a fixed-point, words-per-minute value.
As a general rule of thumb, "normal" speaking rates range from around 150 WPM to around 180 WPM. It is important when working with speaking rates, however, to keep in mind that users will differ greatly in their ability to understand synthesized speech at a particular rate, based upon their level of experience listening to the voice and their ability to anticipate the types of utterances they will encounter.

RESULT CODES
noErr                   0       No error
invalidComponentID  –3000       Invalid SpeechChannel parameter

GetSpeechRate

The GetSpeechRate routine returns the speech rate currently active on a designated speech channel.

pascal OSErr GetSpeechRate (SpeechChannel chan, Fixed *rate);

Field descriptions
chan        Specific speech channel
*rate       Pointer to the current speaking rate

DESCRIPTION
The GetSpeechRate routine is used to find out the speaking rate currently active on a speech channel.

RESULT CODES
noErr                   0       No error
invalidComponentID  –3000       Invalid SpeechChannel parameter

SetSpeechPitch

The SetSpeechPitch routine sets the speaking pitch on a designated speech channel.

pascal OSErr SetSpeechPitch (SpeechChannel chan, Fixed pitch);

Field descriptions
chan        Specific speech channel
pitch       Pitch of the voice

DESCRIPTION
Use the SetSpeechPitch routine to change the current speaking pitch on a speech channel.

RESULT CODES
noErr                   0       No error
invalidComponentID  –3000       Invalid SpeechChannel parameter

GetSpeechPitch

The GetSpeechPitch routine returns the current speaking pitch on a designated speech channel.

pascal OSErr GetSpeechPitch (SpeechChannel chan, Fixed *pitch);

Field descriptions
chan        Specific speech channel
*pitch      Pointer to the current speaking pitch

DESCRIPTION
The GetSpeechPitch routine is used to find out the speaking pitch currently active on a speech channel.

RESULT CODES
noErr                   0       No error
invalidComponentID  –3000       Invalid SpeechChannel parameter

Putting It All Together

The code fragment in Listing 1-4 illustrates many of the routines introduced in this section.
The example steps through the list of available voices to find the first female voice. Then it creates a new speech channel and begins speaking. While the voice is speaking, the pitch of the voice is continually adjusted around the original pitch. If the mouse button is pressed while the voice is speaking, the code halts the speech and exits. This example uses the SpeechAvailable and GetVoiceGender routines shown earlier in Listing 1-1 and Listing 1-3.

Listing 1-4 Putting it all together

OSErr               err;
Str255              myStr = "\pThe bat sat on my hat.";
VoiceSpec           voice;
VoiceDescription    vd;
Boolean             gotVoice = FALSE;
short               voiceCount, gender, i;
SpeechChannel       chan;
Fixed               origPitch, newPitch;

if (myStr[0] && SpeechAvailable()) {
    err = CountVoices(&voiceCount);         // count the available voices
    i = 1;
    while ((i <= voiceCount) && ((err = GetIndVoice(i++, &voice)) == noErr)) {
        err = GetVoiceGender(&voice, &gender);
        if ((err == noErr) && (gender == kFemale)) {
            gotVoice = TRUE;
            break;
        }
    }
    if (gotVoice) {
        err = NewSpeechChannel(&voice, &chan);
        if (err == noErr) {
            err = GetSpeechPitch(chan, &origPitch);     // current pitch
            if (err == noErr)
                err = SpeakText(chan, &myStr[1], myStr[0]);
            i = 0;
            if (err == noErr)
                while (SpeechBusy() > 0) {
                    CoolAnimationRoutine();
                    newPitch = (i - 4) << 16;           // fixed-point pitch offset
                    newPitch += origPitch;
                    i = (i + 1) & 7;                    // steps from 0 to 7 repeatedly
                    err = SetSpeechPitch(chan, newPitch);
                    if ((err != noErr) || Button()) {
                        err = StopSpeech(chan);
                        break;
                    }
                }
            err = DisposeSpeechChannel(chan);
        }
    }
    if (err != noErr)
        NotSoCoolAlertRoutine(err);
}

Advanced Routines

This section describes several advanced or rarely used Speech Manager routines. You can use them to improve the quality of your application's speech.

Advanced Speech Controls

The StopSpeech routine, described in "Starting and Stopping Speech," earlier in this document, provides a simple way to interrupt any speech output instantly.
In some situations it is preferable to be able to stop speech production at the next natural boundary, such as the next word or the end of the current sentence. StopSpeechAt provides that capability.

Similarly, the PauseSpeechAt routine causes speech to pause at a specified point in the text being spoken; the ContinueSpeech routine resumes speech after it has paused.

In addition to SpeakString and SpeakText, described earlier in this document, the Speech Manager provides a third, more general routine. SpeakBuffer is the low-level speech routine upon which the other two are built. SpeakBuffer provides greater control through the use of an additional flags parameter.

The SpeechBusySystemWide routine tells you if any speech is currently being synthesized in your application or elsewhere on the computer.

StopSpeechAt

The StopSpeechAt routine halts speech at a specific point in the text being spoken.

pascal OSErr StopSpeechAt (SpeechChannel chan, long whereToStop);

enum {
    kImmediate      = 0,
    kEndOfWord      = 1,
    kEndOfSentence  = 2
};

Field descriptions
chan            Specific speech channel
whereToStop     Location in text at which speech is to stop

DESCRIPTION
StopSpeechAt is used to halt the production of speech at a specified point in the text. The whereToStop argument should be set to one of the following constants:

- The kImmediate constant stops speech output immediately.
- The kEndOfWord constant lets speech continue until the current word has been spoken.
- The kEndOfSentence constant lets speech continue until the end of the current sentence has been reached.

This routine returns immediately, although speech output continues until the specified point has been reached.

WARNING
You must not release the memory associated with the current text buffer until the channel status indicates that the speech channel output is no longer busy.

If the end of the input text buffer is reached before the specified stopping point, the speech synthesizer stops at the end of the buffer.
Once the stopping point has been reached, the application is free to release the text buffer. Calling StopSpeechAt with whereToStop equal to kImmediate is equivalent to calling StopSpeech, described on page 14. Contrast the StopSpeechAt routine with PauseSpeechAt, described next.

RESULT CODES
noErr 0 No error
invalidComponentID –3000 Invalid SpeechChannel parameter

PauseSpeechAt

The PauseSpeechAt routine causes speech to pause at a specified point in the text being spoken.

pascal OSErr PauseSpeechAt (SpeechChannel chan, long whereToPause);

enum {
    kImmediate = 0,
    kEndOfWord = 1,
    kEndOfSentence = 2
};

Field descriptions
chan  Specific speech channel
whereToPause  Location in text at which speech is to pause

DESCRIPTION
PauseSpeechAt makes speech production pause at a specified point in the text. The whereToPause parameter should be set to one of these constants:
- The kImmediate constant pauses speech output immediately.
- The kEndOfWord constant lets speech continue until the current word has been spoken.
- The kEndOfSentence constant lets speech continue until the end of the current sentence has been reached.
When the specified point is reached, the speech channel enters the paused state, reflected in the channel’s status. PauseSpeechAt returns immediately, although speech output will continue until the specified point.

If the end of the input text buffer is reached before the specified pause point, speech output pauses at the end of the buffer.

PauseSpeechAt differs from StopSpeech and StopSpeechAt in that a subsequent call to ContinueSpeech, described next, causes the contents of the current text buffer to continue being spoken.

WARNING: While in a paused state, the last text buffer must remain available at all times and must not move.
While paused, the SpeechChannel status indicates outputBusy = true and outputPaused = true.

RESULT CODES
noErr 0 No error
invalidComponentID –3000 Invalid SpeechChannel parameter

ContinueSpeech

The ContinueSpeech routine resumes speech after it has been paused by the PauseSpeechAt routine.

pascal OSErr ContinueSpeech (SpeechChannel chan);

Field descriptions
chan  Specific speech channel

DESCRIPTION
At any time after PauseSpeechAt is called, ContinueSpeech may be called to continue speaking from the point at which speech paused. Calling ContinueSpeech on a channel that is not currently in a pause state has no effect; calling it before a pause is effective cancels the pause.

RESULT CODES
noErr 0 No error
invalidComponentID –3000 Invalid SpeechChannel parameter

SpeakBuffer

The SpeakBuffer routine causes the contents of a text buffer to be spoken, using certain flags to control speech behavior.

pascal OSErr SpeakBuffer (SpeechChannel chan, Ptr textBuf,
    long byteLength, long controlFlags);

enum {
    kNoEndingProsody = 1,
    kNoSpeechInterrupt = 2,
    kPreflightThenPause = 4
};

Field descriptions
chan  Specific speech channel
textBuf  Buffer of text
byteLength  Length of textBuf
controlFlags  Control flags to control speech behavior

DESCRIPTION
When the controlFlags parameter is set to 0, SpeakBuffer behaves identically to SpeakText, described on page 13.

The kNoEndingProsody flag bit is used to control whether or not the speech synthesizer automatically applies ending prosody, the speech tone and cadence that normally occur at the end of a statement. Under normal circumstances (for example, when the flag bit is not set), ending prosody is applied to the speech when the end of the textBuf data is reached. This default behavior can be disabled by setting the kNoEndingProsody flag bit. Some synthesizers do not speak until the kNoEndingProsody flag bit is reset, or they encounter a period in the text, or textBuf is full.
The kNoSpeechInterrupt flag bit is used to control the behavior of SpeakBuffer when called on a speech channel that is still busy. When the flag bit is not set, SpeakBuffer behaves similarly to SpeakString and SpeakText, described earlier in this document. Any speech currently being produced on the specified speech channel is immediately interrupted and then the new text buffer is spoken. When the kNoSpeechInterrupt flag bit is set, however, a request to speak on a channel that is still busy processing a prior text buffer will result in an error. The new buffer is ignored and the error synthNotReady is returned. If the prior text buffer has been fully processed, the new buffer is spoken normally.

The kPreflightThenPause flag bit is used to minimize the latency experienced when attempting to speak. Ordinarily, whenever a call to SpeakString, SpeakText, or SpeakBuffer is made, the speech synthesizer must perform a certain amount of initial processing before speech output is heard. This startup latency can vary from a few milliseconds to several seconds depending upon which speech synthesizer is being used. Because larger startup delays may be detrimental to certain applications, a mechanism is provided to give the synthesizer a chance to perform any necessary computations at noncritical times. Once the computations have been completed, the speech is able to start instantly. When the kPreflightThenPause flag bit is set, the speech synthesizer will process the input text as necessary to the point where it is ready to begin producing speech output. At this point, the synthesizer will enter a paused state and return to the caller.
When the application is ready to produce speech, it should call the ContinueSpeech routine to begin speaking.

RESULT CODES
noErr 0 No error
synthNotReady –242 Speech channel is still busy speaking
invalidComponentID –3000 Invalid SpeechChannel parameter

SpeechBusySystemWide

You can use SpeechBusySystemWide to determine if any speech is currently being synthesized in your application or elsewhere on the computer.

pascal short SpeechBusySystemWide (void);

DESCRIPTION
This routine is useful when you want to ensure that no speech is currently being produced anywhere on the Macintosh computer. SpeechBusySystemWide returns the total number of speech channels currently synthesizing speech on the computer, whether they were initiated by your code or by some other process executing concurrently.

RESULT CODES
None

Converting Text Into Phonemes

In some situations it is desirable to convert a text string into its equivalent phonemic representation. This may be useful during the content development process to fine-tune the pronunciation of particular words or phrases. By first converting the target phrase into phonemes, you can see what the synthesizer will try to speak. Then you need only correct the parts that would not have been spoken the way you want.

TextToPhonemes

The TextToPhonemes routine converts designated text to phoneme codes.

pascal OSErr TextToPhonemes (SpeechChannel chan, Ptr textBuf,
    long textBytes, Handle phonemeBuf, long *phonemeBytes);

Field descriptions
chan  Specific speech channel
textBuf  Buffer of text
textBytes  Length of textBuf in bytes
phonemeBuf  Buffer of phonemes
*phonemeBytes  Pointer to length of phonemeBuf in bytes

DESCRIPTION
It may be useful to convert your text into phonemes during application development to reduce the amount of memory required to speak. If your application does not require the text-to-phoneme conversion portion of the speech synthesizer, significantly less RAM may be required to speak with some synthesizers.
Additionally, you may be able to use a higher-quality text-to-phoneme conversion process (even one that does not work in real time) to generate precise phonemic information. This data can then be used with any speech synthesizer to produce better speech.

TextToPhonemes accepts a valid SpeechChannel parameter, a pointer to the characters to be converted into phonemes, the length of the input text buffer in bytes, an application-supplied handle into which the converted phonemes can be written, and a length parameter. On return, the phonemeBytes argument is set to the number of phoneme character bytes that were written into phonemeBuf. The data returned by TextToPhonemes will correspond precisely to the phonemes that would be spoken had the input text been sent to SpeakText instead. All current mode settings are applied to the converted speech. No callbacks are generated while the TextToPhonemes routine is generating its output.

RESULT CODES
noErr 0 No error
paramErr –50 Parameter value is invalid
nilHandleErr –109 Handle argument is nil
siUnknownInfoType –231 Feature not implemented on synthesizer
invalidComponentID –3000 Invalid SpeechChannel parameter

Getting Information About a Speech Channel

Several additional types of information have been made available for advanced users of the Speech Manager. This information provides more detailed status information for each channel. You can get this information by calling the GetSpeechInfo routine. This function accepts selectors that determine the type of information you want to get.

Note: Throughout this document, there are several references to parameter values specified with fixed-point integer values (pbas, pmod, rate, and volm). Unless otherwise stated, the full range of values of the Fixed data type is valid. However, it is left to the individual speech synthesizer implementation to determine whether or not to use the full resolution and range of the Fixed data type.
In the event a specified parameter value lies outside the range supported by a particular synthesizer, the synthesizer will substitute the value closest to the specified value that does lie within its performance specifications.

GetSpeechInfo

The GetSpeechInfo routine returns information about a designated speech channel.

pascal OSErr GetSpeechInfo (SpeechChannel chan, OSType selector,
    void *speechInfo);

enum {
    soStatus = 'stat', // gets speech output status
    soErrors = 'erro', // gets error status
    soInputMode = 'inpt', // gets current text/phon mode
    soCharacterMode = 'char', // gets current character mode
    soNumberMode = 'nmbr', // gets current number mode
    soRate = 'rate', // gets current speaking rate
    soPitchBase = 'pbas', // gets current baseline pitch
    soPitchMod = 'pmod', // gets current pitch modulation
    soVolume = 'volm', // gets current speaking volume
    soSynthType = 'vers', // gets speech synth version info
    soRecentSync = 'sync', // gets most recent sync message info
    soPhonemeSymbols = 'phsy', // gets phoneme symbols & ex. words
    soSynthExtension = 'xtnd' // gets synthesizer-specific info
};

Field descriptions
chan  Specific speech channel
selector  Used to specify data being requested
*speechInfo  Pointer to an information structure

DESCRIPTION
The following list of selectors describes the various types of information that can be obtained from the Speech Manager. The format of the information returned depends on which value is used in the selector field, as follows:

Note: For future code compatibility, use the application programming interface (API) labels instead of literal selector values.

Field descriptions
stat  Gets various items of status information for the specified channel. Indicates whether any speech audio is being generated, whether or not the channel has paused, how many bytes in the input text have yet to be processed, and the phoneme code for the phoneme that is currently being generated.
If inputBytesLeft is 0, the input buffer is no longer needed and can be disposed of. The API label for this selector is soStatus.

typedef SpeechStatusInfo *speechInfo;
typedef struct SpeechStatusInfo {
    Boolean outputBusy; // true = audio playing
    Boolean outputPaused; // true = channel paused
    long inputBytesLeft; // bytes left to process
    short phonemeCode; // current phoneme code
} SpeechStatusInfo;

erro  Gets saved error information and clears the error registers. This selector lets you poll for various run-time errors that occur during speaking, such as the detection of badly formed embedded commands. Errors returned directly by Speech Manager routines are not reported here. The count field shows how many errors have occurred since the last check. If count is 0 or 1, then oldest and newest will be the same. Otherwise, oldest contains the error code for the oldest unread error and newest contains the error code for the most recent error. Both oldPos and newPos contain the character positions of their respective errors in the original input text buffer. The API label for this selector is soErrors.

typedef SpeechErrorInfo *speechInfo;
typedef struct SpeechErrorInfo {
    short count; // # of errs since last check
    OSErr oldest; // oldest unread error
    long oldPos; // char position of oldest err
    OSErr newest; // most recent error
    long newPos; // char position of newest err
} SpeechErrorInfo;

inpt  Gets the current value of the text processing mode control. The returned value specifies whether the specified speech channel is currently in text-input mode (TEXT) or phoneme-input mode (PHON). The API label for this selector is soInputMode.

typedef OSType *speechInfo; // TEXT or PHON

char  Gets the current value of the character processing mode control. The returned value specifies whether the specified speech channel is currently processing input characters in normal mode (NORM) or in literal, letter-by-letter, mode (LTRL). The API label for this selector is soCharacterMode.
typedef OSType *speechInfo; // NORM or LTRL

nmbr  Gets the current value of the number processing mode control. The returned value specifies whether the specified speech channel is currently processing input character digits in normal mode (NORM) or in literal, digit-by-digit, mode (LTRL). The API label for this selector is soNumberMode.

typedef OSType *speechInfo; // NORM or LTRL

rate  Gets the current speaking rate in words per minute on the specified channel. Speaking rates are fixed-point values. The API label for this selector is soRate.

typedef Fixed *speechInfo;

Note: Words per minute is a convenient, if difficult to define, way of representing speaking rate. Although there is no universally accepted definition of words per minute, it does communicate approximate information about speaking rates. Any specific rate may correspond to different rates on different synthesizers, but the two rates should be reasonably close. More importantly, doubling the rate on a particular synthesizer should halve the time needed to speak any particular utterance.

pbas  Gets the current baseline pitch for the specified channel. The pitch value is a fixed-point integer that conforms to the following frequency relationship:

Hertz = 440.0 * 2^((BasePitch - 69) / 12)

BasePitch of 1.0 ≈ 9 Hertz
BasePitch of 39.5 ≈ 80 Hertz
BasePitch of 45.8 ≈ 115 Hertz
BasePitch of 50.4 ≈ 150 Hertz
BasePitch of 100.0 ≈ 2637 Hertz

BasePitch values are always positive numbers in the range from 1.0 through 100.0. The API label for this selector is soPitchBase.

typedef Fixed *speechInfo;

pmod  Gets the current pitch modulation range for the speech channel. Modulation values range from 0.0 through 100.0. A value of 0.0 corresponds to no modulation and means the channel will speak in a monotone. The API label for this selector is soPitchMod.
Nonzero modulation values correspond to pitch and frequency deviations according to the following formula:

Maximum pitch = BasePitch + PitchMod
Minimum pitch = BasePitch - PitchMod
Maximum Hertz = BaseHertz * 2^(+ModValue / 12)
Minimum Hertz = BaseHertz * 2^(-ModValue / 12)

Given: BasePitch of 46.0 (≈ 115 Hertz), PitchMod of 2.0,
Then: Maximum pitch = 48.0 (≈ 131 Hertz), Minimum pitch = 44.0 (≈ 104 Hertz)

typedef Fixed *speechInfo;

volm  Gets the current setting of the volume control on the specified channel. Volumes are expressed in fixed-point units ranging from 0.0 through 1.0. A value of 0.0 corresponds to silence, and a value of 1.0 corresponds to the maximum possible volume. Volume units lie on a scale that is linear with amplitude or voltage. A doubling of perceived loudness corresponds to a doubling of the volume. The API label for this selector is soVolume.

typedef Fixed *speechInfo;

vers  Gets descriptive information for the type of speech synthesizer being used on the specified speech channel. The API label for this selector is soSynthType.

typedef SpeechVersionInfo *speechInfo;
typedef struct SpeechVersionInfo {
    OSType synthType; // always 'ttsc'
    OSType synthSubType; // synth flavor
    OSType synthManufacturer; // synth creator
    long synthFlags; // reserved
    NumVersion synthVersion; // synth version
} SpeechVersionInfo;

sync  Returns the sync message code for the most recently encountered embedded sync command at the audio output point. If no sync command has been encountered, 0 is returned. Refer to the section “Embedded Speech Commands,” later in this document, for information about sync commands. The API label for this selector is soRecentSync.

typedef OSType *speechInfo;

phsy  Returns a list of phoneme symbols and example words defined for the current synthesizer. The input parameter is the address of a handle variable. On return, the PhonemeDescriptor parameter contains a handle to the array of phoneme definitions.
Make sure to dispose of the handle when you are done using it. This information is normally used to indicate to the user the approximate sounds corresponding to various phonemes—an important feature in international speech. The API label for this selector is soPhonemeSymbols.

typedef PhonemeDescriptor ***speechInfo; // VAR Handle
typedef struct PhonemeInfo {
    short opcode; // opcode for the phoneme
    Str15 phStr; // corresponding char string
    Str31 exampleStr; // word that shows use of phoneme
    short hiliteStart; // part of example word to be highlighted,
    short hiliteEnd; //  as in TextEdit selections
} PhonemeInfo;
typedef struct PhonemeDescriptor {
    short phonemeCount; // # of elements
    PhonemeInfo thePhonemes[1]; // element list
} PhonemeDescriptor;

xtnd  This call supports a general method for extending the functionality of the Speech Manager. It is used to get synthesizer-specific information. The format of the returned data is determined by the specific synthesizer queried. The speechInfo argument should be a pointer to the proper data structure. If a particular synthCreator value is not recognized by the synthesizer, the command is ignored and the siUnknownInfoType code is returned. The API label for this selector is soSynthExtension.

typedef SpeechXtndData *speechInfo;
typedef struct SpeechXtndData {
    OSType synthCreator; // synth creator ID
    Byte synthData[2]; // data TBD by synth
} SpeechXtndData;

RESULT CODES
noErr 0 No error
siUnknownInfoType –231 Feature is not implemented on synthesizer
invalidComponentID –3000 Invalid SpeechChannel parameter

Advanced Control Routines

The Speech Manager provides numerous control features for sophisticated developers. These controls enable you to set various speaking parameters programmatically and provide a rich set of callback routines that can be used to notify applications of various conditions within the speaking process. They are extended by many speech synthesizers. These controls are accessed with the SetSpeechInfo routine.
All calls to this routine expect a SpeechChannel parameter, a selector to indicate the desired function, and a pointer to some data. The format of this data depends on the particular selector and is documented in the following routine description.

SetSpeechInfo

The SetSpeechInfo routine sets information for a designated speech channel.

pascal OSErr SetSpeechInfo (SpeechChannel chan, OSType selector,
    void *speechInfo);

enum { // Sets the parameter:
    soInputMode = 'inpt', // current text/phon mode
    soCharacterMode = 'char', // current character mode
    soNumberMode = 'nmbr', // current number mode
    soRate = 'rate', // current speaking rate
    soPitchBase = 'pbas', // current baseline pitch
    soPitchMod = 'pmod', // current pitch modulation
    soVolume = 'volm', // current speaking volume
    soCurrentVoice = 'cvox', // current speaking voice
    soCommandDelimiter = 'dlim', // command delimiters
    soReset = 'rset', // reset channel to default state
    soCurrentA5 = 'myA5', // app's A5 on callbacks
    soRefCon = 'refc', // reference constant
    soTextDoneCallBack = 'tdcb', // text done callback proc
    soSpeechDoneCallBack = 'sdcb', // end-of-speech callback proc
    soSyncCallBack = 'sycb', // sync command callback proc
    soErrorCallBack = 'ercb', // error callback proc
    soPhonemeCallBack = 'phcb', // phoneme callback proc
    soWordCallBack = 'wdcb', // word callback proc
    soSynthExtension = 'xtnd' // synthesizer-specific info
};

Field descriptions
chan  Specific speech channel
selector  Used to specify the data being set
*speechInfo  Pointer to an information structure

DESCRIPTION
The following list of selectors outlines the controls available with the Speech Manager. The format of the information passed depends on which value is used in the selector field, as follows:

Note: The Speech Manager supports several callback features that can provide the sophisticated developer with a tight coupling to the speech synthesis process. However, these callbacks must be used carefully. Each is invoked from interrupt level.
This means that you may not perform any operations that might cause memory to be allocated, purged, or moved. Although application global variables are also ordinarily not accessible at interrupt time, the myA5 selector (API label soCurrentA5) described in the following text can be used to ask the Speech Manager to point register A5 at your application’s global variables prior to each callback. This makes it fairly painless to access global variables from your callback handlers. If this information worries you, don’t despair. Most information available through callbacks is also available through a GetSpeechInfo call. These calls are more friendly and do not come with the constraints imposed upon callback code. The only drawback is that if you do not poll the information you are interested in often enough, you may miss some of the changes in your speech channel’s status.

Field descriptions
inpt  Sets the current value of the text processing mode control. The passed value specifies whether the speech channel should be in text-input mode (TEXT) or phoneme-input mode (PHON). Input mode changes take effect as soon as possible; however, the precise latency is dependent upon the specific speech synthesizer. The API label for this selector is soInputMode.

typedef OSType *speechInfo; // TEXT or PHON

char  Sets the current value of the character processing mode control. The passed value specifies whether the speech channel should be in normal character processing mode (NORM) or literal, letter-by-letter, mode (LTRL). Character mode changes take effect as soon as possible; however, the precise latency is dependent upon the specific speech synthesizer. The API label for this selector is soCharacterMode.

typedef OSType *speechInfo; // NORM or LTRL

nmbr  Sets the current value of the number processing mode control. The passed value specifies whether the specified speech channel should be in normal number processing mode (NORM) or in literal, digit-by-digit, mode (LTRL).
The number mode changes take effect as soon as possible. However, the precise latency is dependent upon the specific speech synthesizer. The API label for this selector is soNumberMode.

typedef OSType *speechInfo; // NORM or LTRL

rate  Sets the speaking rate in words per minute on the specified channel. Speaking rates are fixed-point values. All values are valid; however, specific synthesizers will not necessarily be able to speak at all possible rates. The API label for this selector is soRate.

typedef Fixed *speechInfo;

pbas  Changes the current baseline pitch for the specified channel. The pitch value is a fixed-point integer that conforms to the following frequency relationship:

Hertz = 440.0 * 2^((BasePitch - 69) / 12)

BasePitch of 1.0 ≈ 9 Hertz
BasePitch of 39.5 ≈ 80 Hertz
BasePitch of 45.8 ≈ 115 Hertz
BasePitch of 50.4 ≈ 150 Hertz
BasePitch of 100.0 ≈ 2637 Hertz

BasePitch values are always positive numbers in the range from 1.0 through 100.0. The API label for this selector is soPitchBase.

typedef Fixed *speechInfo;

pmod  Changes the current pitch modulation range for the speech channel. Modulation values range from 0.0 through 100.0. A value of 0.0 corresponds to no modulation and means the channel will speak in a monotone. Nonzero modulation values correspond to pitch and frequency deviations according to the following formula:

Maximum pitch = BasePitch + PitchMod
Minimum pitch = BasePitch - PitchMod
Maximum Hertz = BaseHertz * 2^(+ModValue / 12)
Minimum Hertz = BaseHertz * 2^(-ModValue / 12)

Given: BasePitch of 46.0 (≈ 115 Hertz), PitchMod of 2.0,
Then: Maximum pitch = 48.0 (≈ 131 Hertz), Minimum pitch = 44.0 (≈ 104 Hertz)

The API label for this selector is soPitchMod.

typedef Fixed *speechInfo;

volm  Changes the current speaking volume on the specified channel. Volumes are expressed in fixed-point units ranging from 0.0 through 1.0. A value of 0.0 corresponds to silence, and a value of 1.0 corresponds to the maximum possible volume.
Volume units lie on a scale that is linear with amplitude or voltage. A doubling of perceived loudness corresponds to a doubling of the volume. The API label for this selector is soVolume.

typedef Fixed *speechInfo;

cvox  Changes the current voice on the current speech channel to the specified voice. Note that this control call will return an incompatibleVoice error if the specified voice is incompatible with the speech synthesizer associated with the speech channel. The API label for this selector is soCurrentVoice.

typedef VoiceSpec *speechInfo;

dlim  Sets the delimiter character strings for embedded commands. The start of an embedded command is determined by comparing the input characters to the start-command delimiter string. Likewise, the end of a command is determined by comparing the input characters to the end-command delimiter string. Command delimiter strings are either 1 or 2 bytes in length. If a single-byte delimiter is desired, it should be followed by a null (0) byte. Delimiter characters must come from the set of printable characters. If the delimiter strings are empty, this will have the effect of disabling embedded command processing. Care must be taken not to choose delimiter strings that might occur naturally in the text to be spoken. The API label for this selector is soCommandDelimiter.

typedef DelimiterInfo *speechInfo;
typedef struct DelimiterInfo {
    Byte startDelimiter[2]; // defaults to "[["
    Byte endDelimiter[2]; // defaults to "]]"
} DelimiterInfo;

rset  Resets the speech channel to its default states. The speechInfo parameter should be set to 0. Specific synthesizers may provide other reset capabilities. The API label for this selector is soReset.

typedef long *speechInfo;

myA5  An application uses this selector to request that the speech synthesizer set up an A5 world prior to all callbacks.
In order for an application to access any of its global data, it is necessary that register A5 contain the correct value, since all global variables are referenced relative to register A5. If you pass a non-null value in the speechInfo parameter, the speech synthesizer will set register A5 to this value just before it calls one of your callback routines. The A5 register is restored to its original value when your callback routine returns. The API label for this selector is soCurrentA5.

typedef Ptr speechInfo;

A typical application would make the call to SetSpeechInfo with code like the following:

myA5 = SetCurrentA5();
err = SetSpeechInfo(mySpeechChannel, soCurrentA5, (Ptr) myA5);

refc  Sets the reference constant associated with the specified channel. All callbacks generated for this channel will return this reference constant for use by the application. The application can use this value any way it wants to. The API label for this selector is soRefCon.

typedef long *speechInfo;

tdcb  Enables the callback that signals that text input processing is done. Your callback routine is invoked when the current buffer of input text has been processed and is no longer needed by the speech synthesizer. This callback does not indicate that the synthesizer is finished speaking the text (see the sdcb callback description, next), merely that the input text has been fully processed and is no longer needed by the speech synthesizer. This callback can be disabled by passing a null ProcPtr in the speechInfo parameter. When your callback routine is invoked, you have two options. If you set the nextBuf, byteLen, and controlFlags variables before returning, you will enable the speech synthesizer to continue speaking without any interruption in the output. If you set the nextBuf parameter to null, you are indicating that you have no more text to speak. The controlFlags parameter is defined as in SpeakBuffer. The API label for this selector is soTextDoneCallBack.
typedef Ptr speechInfo;
pascal void MyInputDoneCallback (SpeechChannel chan, long refCon,
    Ptr *nextBuf, long *byteLen, long *controlFlags);

sdcb  Enables an end-of-speech callback. Your callback routine is called whenever an input text stream has been completely processed and spoken. When your callback routine is invoked, you can be certain that the speech channel is now idle and no audio is being generated. This callback can be disabled by passing a null ProcPtr in the speechInfo parameter. The API label for this selector is soSpeechDoneCallBack.

typedef Ptr speechInfo;
pascal void MyEndOfSpeechCallback (SpeechChannel chan, long refCon);

sycb  Enables the sync command callback. Your callback routine is invoked when the text following a sync embedded command is about to be spoken. This callback can be disabled by passing a null ProcPtr in the speechInfo parameter. See “Embedded Speech Commands,” later in this document, for a description of how to use sync commands. The API label for this selector is soSyncCallBack.

typedef Ptr speechInfo;
pascal void MySyncCommandCallback (SpeechChannel chan, long refCon,
    OSType syncMessage);

ercb  Enables error callbacks. Your callback routine is called whenever an error occurs during the processing of an input text stream. Errors can result from syntax problems in the input text, insufficient CPU processing speed (such as an audio data underrun), or other conditions that may arise during the speech conversion process. If error callbacks have not been enabled, the Speech Manager saves the error code when an error condition is detected. The error codes can then be read using the GetSpeechInfo status selector soErrors (erro). The error callback can be disabled by passing a null ProcPtr in the speechInfo parameter. The API label for this selector is soErrorCallBack.

typedef Ptr speechInfo;
pascal void MyErrorCallback (SpeechChannel chan, long refCon,
    OSErr error, long bytePos);

phcb  Enables phoneme callbacks.
Your callback routine is invoked for each phoneme generated by the speech synthesizer just before the phoneme is actually spoken. This callback can be disabled by passing a null ProcPtr in the speechInfo parameter. The API label for this selector is soPhonemeCallBack.

typedef Ptr speechInfo;
pascal void MyPhonemeCallBack (SpeechChannel chan, long refCon,
    short phonemeOpcode);

wdcb  Enables word callbacks. Your callback routine is invoked for each word generated by the speech synthesizer just before the word is actually spoken. This callback can be disabled by passing a null ProcPtr in the speechInfo parameter. The API label for this selector is soWordCallBack.

typedef Ptr speechInfo;
pascal void MyWordCallback (SpeechChannel chan, long refCon,
    long wordPos, short wordLen);

xtnd  This call supports a general method for extending the functionality of the Speech Manager. It is used to set synthesizer-specific information. The speechInfo argument should be a pointer to the appropriate data structure. If a particular synthCreator value is not recognized by the synthesizer, the command is ignored and an siUnknownInfoType code is returned. The API label for this selector is soSynthExtension.

typedef SpeechXtndData *speechInfo;
typedef struct SpeechXtndData {
    OSType synthCreator; // synth creator ID
    Byte synthData[2]; // data TBD by synth
} SpeechXtndData;

RESULT CODES
noErr 0 No error
paramErr –50 Parameter value is invalid
siUnknownInfoType –231 Feature is not implemented on synthesizer
incompatibleVoice –245 Specified voice cannot be used with synthesizer
invalidComponentID –3000 Invalid SpeechChannel parameter

Application-Defined Pronunciation Dictionaries

No matter how sophisticated a speech synthesis system is, there will always be words that it does not automatically pronounce correctly. The clearest instance of words that are often mispronounced is the class of proper names (names of people, place names, and so on).
One way to get around this fundamental limitation is to use a dictionary of pronunciations. Whenever a speech synthesizer needs to determine the proper phonemic representation for a particular word, it first looks for the word in its dictionaries. Pronunciation dictionary entries contain information that enables precise conversion between text and the correct phoneme codes. They also provide stress, intonation, and other information that helps speech synthesizers produce more natural speech. If the word in question is found in the dictionary, the synthesizer uses the information from the dictionary entry rather than relying on its own letter-to-sound rules. The use of phonemes is described in "Summary of Phonemes and Prosodic Controls," later in this document.

The Speech Manager word storage format provides high-quality data that is interchangeable between speech synthesizers. The Speech Manager also uses an easily extensible dictionary structure that does not affect the usability of existing dictionaries.

It is assumed that application-defined pronunciation dictionaries will reside in RAM when in use. The run-time structure of dictionary data presumably depends on the specific needs of particular speech synthesizers and will therefore differ from the structure of the dictionaries as stored on disk.

Associating a Dictionary With a Speech Channel

The following routines can be used to associate an application-defined pronunciation dictionary with a particular speech channel.

UseDictionary

The UseDictionary routine associates a designated dictionary with a specific speech channel.

pascal OSErr UseDictionary (SpeechChannel chan, Handle dictionary);

Field descriptions
chan        Specific speech channel
dictionary  Handle to the specified dictionary

DESCRIPTION
The speech synthesizer will attempt to use the dictionary data pointed to by the dictionary handle argument to augment the built-in pronunciation rules on the specified speech channel.
The synthesizer will use whatever elements of the dictionary resource it considers useful to the speech conversion process. After returning from UseDictionary, the caller is free to release any storage allocated for the dictionary handle. The search order for application-provided dictionaries is last in, first searched.

All details of how an application-provided dictionary is represented within the speech synthesizer are dependent on the specific synthesizer implementation and are totally private to the synthesizer.

RESULT CODES
noErr                   0  No error
memFullErr           –108  Not enough memory to use new dictionary
badDictFormat        –246  Format problem with pronunciation dictionary
invalidComponentID  –3000  Invalid SpeechChannel parameter

Pronunciation Dictionary Data Format

Each application-defined pronunciation dictionary is implemented as a single resource of type 'dict'. To read the dictionary contents, the system first reads the resource into memory using Resource Manager routines.

An application dictionary contains the following information:

total byte length    (long)    (Length is all-inclusive)
atom type            (long)
format version       (long)
script code          (short)
language code        (short)
region code          (short)
date last modified   (long)    (Seconds since January 1, 1904)
reserved(4)          (long)
entry count          (long)
list of entries

The currently defined atom type is

'dict '       →  Dictionary

Each entry consists of the following:

entry byte length    (short)   (Length is all-inclusive)
entry type           (short)
field count          (short)
list of fields

The currently defined entry types are the following:

0x00          →  Null entry
0x01 to 0x20  →  Reserved
0x21          →  Pronunciation entry
0x22          →  Abbreviation entry

Each field consists of the following:

field byte length    (short)   (Length is all-inclusive minus padding)
field type           (short)
field data           (char[])  (Data is padded to word boundary)

The currently defined field types are the following:

0x00          →  Null field
0x01 to 0x20  →  Reserved
0x21          →  Word represented in textual format
0x22          →  Phonemic pronunciation, including a complete set of syllable, lexical stress, word prominence, and prosodic markers, represented in textual format
0x23          →  Part-of-speech code

Creating and Editing Dictionaries

There is no built-in support for creating and editing speech dictionaries. You can create dictionary resources using any of the available resource editing tools, such as the MPW Rez tool or ResEdit. Of course, you can also fairly easily develop routines to edit the dictionary structure from within the application. At the present time, no assumption should be made that the entries in a dictionary are stored in sorted order.

Advanced Voice Information Routines

Ordinarily, an application should need to use only the GetVoiceDescription routine to access information about a particular voice. Occasionally, however, it may be necessary to obtain more detailed information by using the GetVoiceInfo routine.

GetVoiceInfo

The GetVoiceInfo routine returns information about a specified voice beyond that obtainable through the GetVoiceDescription routine.

pascal OSErr GetVoiceInfo (VoiceSpec *voice, OSType selector, void *voiceInfo);

typedef VoiceDescription *voiceInfo;
typedef VoiceFileInfo *voiceInfo;

typedef struct VoiceFileInfo {
    FSSpec fileSpec;    // vol, dir, name info for voice file
    short  resID;       // resource ID of voice in the file
} VoiceFileInfo;

enum {
    soVoiceDescription = 'info',    // gets basic voice info
    soVoiceFile        = 'fref'     // gets voice file ref info
};

Field descriptions
*voice      The specified voice
selector    Used to specify the data being requested
*voiceInfo  Pointer to an information structure

DESCRIPTION
This function accepts selectors that determine the type of information you want to get. The format of the information returned depends on which value is used in the selector field, as follows:

info    Gets basic information for the specified voice.
The structure returned is functionally equivalent to the VoiceDescription data structure in GetVoiceDescription, described earlier in this document. To maximize compatibility with future versions of the Speech Manager, the application must set the length field of the VoiceDescription structure to the size of the existing record before calling GetVoiceInfo, which then returns the size of the new record.

fref    Gets file reference information for the specified voice; normally used only by speech synthesizers to access voice disk files directly.

RESULT CODES
noErr             0  No error
memFullErr     –108  Not enough memory to load voice into memory
voiceNotFound  –244  Voice resource not found

Embedded Speech Commands

This section describes how you can insert commands directly into the input text to control or modify the spoken output. When processing input text data, speech synthesizers look for special sequences of characters called delimiters. These character sequences are usually defined to be unusual pairings of printable characters that would not normally appear in the text. When a begin command delimiter string is encountered in the text, the following characters are assumed to contain one or more commands. The synthesizer will attempt to parse and process these commands until an end command delimiter string is encountered.

Embedded Speech Command Syntax

By default, the begin command and end command delimiters are defined to be [[ and ]]. The syntax of embedded command blocks is given below, according to these rules:

■ Items enclosed in angle brackets (< and >) represent logical units that are either defined further below or are atomic units that should be self-explanatory.
■ Items enclosed in square brackets ([ and ]) are optional.
■ Items followed by an ellipsis (…) may be repeated one or more times.
■ For items separated by a vertical bar (|), any one of the listed items may be used.
■ Multiple space characters between tokens may be used if desired.
■ Multiple commands should be separated by semicolons.

All other characters that are not enclosed between angle brackets must be entered literally. There is no limit to the number of commands that can be included in a single command block. Here is the embedded speech command syntax structure:

Identifier        Syntax
CommandBlock      <BeginDelimiter> <CommandList> <EndDelimiter>
BeginDelimiter    <String1> | <String2>
EndDelimiter      <String1> | <String2>
CommandList       <Command> [; <Command>]…
Command           <CommandSelector> [Parameter]…
CommandSelector   <OSType>
Parameter         <OSType> | <String1> | <String2> | <StringN> | <FixedPointValue> | <32BitValue> | <16BitValue> | <8BitValue>
String1           <QuoteChar> <Character> <QuoteChar>
String2           <QuoteChar> <Character> <Character> <QuoteChar>
StringN           <QuoteChar> [<Character>]… <QuoteChar>
QuoteChar         " | '
OSType            <4-character pattern (e.g., RATE, vers, aBcD)>
Character         <Any printable character (e.g., A, b, *, #, x)>
FixedPointValue   <Decimal number: 0.0000 ≤ N ≤ 65535.9999>
32BitValue        <OSType> | <LongInt> | <HexLongInt>
16BitValue        <Integer> | <HexInteger>
8BitValue         <Byte> | <HexByte>
LongInt           <Decimal number: 0 ≤ N ≤ 4294967295>
HexLongInt        <Hex number: 0x00000000 ≤ N ≤ 0xFFFFFFFF>
Integer           <Decimal number: 0 ≤ N ≤ 65535>
HexInteger        <Hex number: 0x0000 ≤ N ≤ 0xFFFF>
Byte              <Decimal number: 0 ≤ N ≤ 255>
HexByte           <Hex number: 0x00 ≤ N ≤ 0xFF>

Embedded Speech Command Set

Table 1-1 outlines the set of currently defined embedded speech commands.

Table 1-1  Embedded speech commands

Version (vers)
    vers <Version>
    Version ::= <32BitValue>
    This command informs the synthesizer of the format version that will be used in subsequent commands. This command is optional but is highly recommended. The current version is 1.

Delimiter (dlim)
    dlim <BeginDelimiter> <EndDelimiter>
    The delimiter command specifies the character sequences that mark the beginning and end of all subsequent commands.
    The new delimiters take effect at the end of the current command block. If the delimiter strings are empty, an error is generated. (Contrast this behavior with the dlim function of SetSpeechInfo.)

Comment (cmnt)
    cmnt [Character]…
    This command enables a developer to insert a comment into a text stream for documentation purposes. Note that all characters following the cmnt selector up to the <EndDelimiter> are part of the comment.

Reset (rset)
    rset <32BitValue>
    The reset command will reset the speech channel's settings back to the default values. The parameter should be set to 0.

Baseline pitch (pbas)
    pbas [+ | -] <Pitch>
    Pitch ::= <FixedPointValue>
    The baseline pitch command changes the current pitch for the speech channel. The pitch value is a fixed-point number in the range 1.0 through 100.0 that conforms to the frequency relationship

        Hertz = 440.0 * 2^((Pitch – 69) / 12)

    If the pitch number is preceded by a + or – character, the baseline pitch is adjusted relative to its current value. Pitch values are always positive numbers. For further details, see "SetSpeechInfo," earlier in this document.

Pitch modulation (pmod)
    pmod [+ | -] <ModulationDepth>
    ModulationDepth ::= <FixedPointValue>
    The pitch modulation command changes the modulation range for the speech channel. The modulation value is a fixed-point number in the range 0.0 through 100.0 that conforms to the following pitch and frequency relationships:

        Maximum pitch = BasePitch + PitchMod
        Minimum pitch = BasePitch – PitchMod
        Maximum Hertz = BaseHertz * 2^(+ModValue / 12)
        Minimum Hertz = BaseHertz * 2^(–ModValue / 12)

    A value of 0.0 corresponds to no modulation and will cause the speech channel to speak in a monotone. If the modulation depth number is preceded by a + or – character, the pitch modulation is adjusted relative to its current value. For further details, see "SetSpeechInfo," earlier in this document.
Speaking rate (rate)
    rate [+ | -] <WordsPerMinute>
    WordsPerMinute ::= <FixedPointValue>
    The speaking rate command sets the speaking rate in words per minute on the speech channel. If the rate value is preceded by a + or – character, the speaking rate is adjusted relative to its current value.

Volume (volm)
    volm [+ | -] <Volume>
    Volume ::= <FixedPointValue>
    The volume command changes the speaking volume on the speech channel. Volumes are expressed in fixed-point units ranging from 0.0 through 1.0. A value of 0.0 corresponds to silence, and a value of 1.0 corresponds to the maximum possible volume. Volume units lie on a scale that is linear with amplitude or voltage; a doubling of perceived loudness corresponds to a doubling of the volume.

Sync (sync)
    sync <SyncMessage>
    SyncMessage ::= <32BitValue>
    The sync command causes a callback to the application's sync command callback routine. The callback is made when the audio corresponding to the next word begins to sound. The callback routine is passed the SyncMessage value from the command. If the callback routine has not been defined, the command is ignored. For further details, see "SetSpeechInfo," earlier in this document.

Input mode (inpt)
    inpt TX | TEXT | PH | PHON
    This command switches the input processing mode to either normal text mode or raw phoneme mode.

Character mode (char)
    char NORM | LTRL
    The character mode command sets the word speaking mode of the speech synthesizer. When NORM mode is selected, the synthesizer attempts to automatically convert words into speech. This is the most basic function of the text-to-speech synthesizer. When LTRL mode is selected, the synthesizer speaks every word, number, and symbol letter by letter. Embedded command processing continues to function normally, however.

Number mode (nmbr)
    nmbr NORM | LTRL
    The number mode command sets the number speaking mode of the speech synthesizer.
    When NORM mode is selected, the synthesizer attempts to automatically speak numeric strings as intelligently as possible. When LTRL mode is selected, numeric strings are spoken digit by digit.

Silence (slnc)
    slnc <Milliseconds>
    Milliseconds ::= <32BitValue>
    The silence command causes the synthesizer to generate silence for the specified amount of time.

Emphasis (emph)
    emph + | -
    The emphasis command causes the next word to be spoken with either greater or less emphasis than would normally be used. Using + will force added emphasis, while using – will force reduced emphasis.

Synthesizer-specific (xtnd)
    xtnd <SynthCreator> [parameter]
    SynthCreator ::= <OSType>
    The extension command enables synthesizer-specific commands to be embedded in the input text stream. The format of the data following SynthCreator is entirely dependent on the synthesizer being used. If a particular SynthCreator is not recognized by the synthesizer, the command is ignored, but no error is generated.

Synthesizers often support embedded commands that extend the set given in Table 1-1.

Embedded Speech Command Error Reporting

While embedded speech commands are being processed, several types of errors may be detected and reported to your application. If you have set up an error callback handler with the soErrorCallBack selector of the SetSpeechInfo routine (described earlier), you will be notified once for every error that is detected. If you have not enabled error callbacks, you can still obtain information about the errors encountered by calling GetSpeechInfo with the soErrors selector (also described earlier).
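The polling path described above can be sketched with a small model. The SpeechErrorInfo structure is taken from the summary at the end of this document; the RecordError and PollErrors functions are hypothetical illustrations of the bookkeeping a synthesizer might perform, not the actual Toolbox implementation.

```c
#include <string.h>

typedef short OSErr;

/* SpeechErrorInfo as given in "Summary of the Speech Manager." */
typedef struct SpeechErrorInfo {
    short count;    /* # of errs since last check    */
    OSErr oldest;   /* oldest unread error           */
    long  oldPos;   /* char position of oldest err   */
    OSErr newest;   /* most recent error             */
    long  newPos;   /* char position of newest err   */
} SpeechErrorInfo;

/* Hypothetical bookkeeping: what a synthesizer might do each time an
   embedded-command error is detected while error callbacks are disabled. */
static void RecordError(SpeechErrorInfo *info, OSErr err, long bytePos)
{
    if (info->count == 0) {     /* first unread error becomes the oldest */
        info->oldest = err;
        info->oldPos = bytePos;
    }
    info->newest = err;         /* most recent error is always updated */
    info->newPos = bytePos;
    info->count++;
}

/* Hypothetical poll: reading with the soErrors selector returns the
   accumulated record and clears it, so each error is reported once. */
static SpeechErrorInfo PollErrors(SpeechErrorInfo *pending)
{
    SpeechErrorInfo out = *pending;
    memset(pending, 0, sizeof *pending);
    return out;
}
```

After two errors are recorded, a single poll yields count 2 with the oldest and newest codes and positions filled in, and a second poll yields count 0.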
The following errors are detected during processing of embedded speech commands:

badParmVal    –245  Parameter value is invalid
badCmdText    –246  Embedded command syntax or parameter problem
unimplCmd     –247  Embedded command is not implemented on synthesizer
unimplMsg     –248  Raw phoneme text contains invalid characters
badVoiceID    –250  Specified voice has not been preloaded
badParmCount  –252  Incorrect number of embedded command arguments found

Summary of Phonemes and Prosodic Controls

This section summarizes the phonemes and prosodic controls used by American English speech synthesizers.

Phoneme Set

Table 1-2 summarizes the set of standard phonemes recognized by American English speech synthesizers.

In this description, it is assumed that specific rules and markers apply only to general American English. Other languages and dialects require different phoneme inventories. Phonemes divide into two groups: vowels and consonants. All vowel symbols are uppercase pairs of letters. For consonants, in cases in which the correspondence between the consonant and its symbol is apparent, the symbol is that lowercase consonant; in other cases, the symbol is an uppercase consonant.
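The symbol convention just described (vowels are two-letter uppercase pairs; consonants are single letters) can be expressed as a small classifier. This helper is our illustration, not part of the Speech Manager API, and it covers only the regular phoneme symbols, not the % and @ markers.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if a phoneme symbol follows the
   vowel convention stated above, i.e., it is a pair of uppercase letters.
   Consonant symbols are a single letter (lowercase when the sound is
   obvious, uppercase otherwise). */
static int IsVowelSymbol(const char *sym)
{
    return strlen(sym) == 2 &&
           isupper((unsigned char)sym[0]) &&
           isupper((unsigned char)sym[1]);
}
```

For example, "AE" and "OW" classify as vowels, while "b" and "C" do not.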
Within the example words, the individual sounds being exemplified appear in boldface in the original printed table.

Table 1-2  American English phoneme symbols

Symbol  Example  Opcode      Symbol  Example  Opcode
AE      bat         2        b       bin        18
EY      bait        3        C       chin       19
AO      caught      4        d       din        20
AX      about       5        D       them       21
IY      beet        6        f       fin        22
EH      bet         7        g       gain       23
IH      bit         8        h       hat        24
AY      bite        9        J       gin        25
IX      roses      10        k       kin        26
AA      cot        11        l       limb       27
UW      boot       12        m       mat        28
UH      book       13        n       nat        29
UX      bud        14        N       tang       30
OW      boat       15        p       pin        31
AW      bout       16        r       ran        32
OY      boy        17        s       sin        33
%       silence     0        S       shin       34
@       breath intake 1      t       tin        35
                             T       thin       36
                             v       van        37
                             w       wet        38
                             y       yet        39
                             z       zen        40
                             Z       genre      41

Note: The "silence" phoneme (%) and the "breath" phoneme (@) may be lengthened or shortened like any other phoneme.

Prosodic Controls

The symbols listed in Table 1-3 are recognized as modifiers to the basic phonemes described in the preceding section. They can be used to more precisely control the quality of speech that is described in terms of raw phonemes.

Table 1-3  Prosodic control symbols

Lexical stress: Marks stress within a word
    Primary stress      1                 AEnt2IHsIXp1EYSAXn ("anticipation")
    Secondary stress    2

Syllable breaks: Marks syllable breaks within a word
    Syllable mark       = (equal)         AEn=t2IH=sIX=p1EY=SAXn ("anticipation")

Word prominence: Marks the beginning of a word (required)
    Unstressed          ~ (asciitilde)    Used for words with minimal information content
    Normal stress       _ (underscore)    Used for information-bearing words
    Emphatic stress     + (plus)          Special emphasis for a word

Prosodic: Placed before the affected phoneme
    Pitch rise          / (slash)         Pitch will rise on the following phoneme
    Pitch fall          \ (backslash)     Pitch will fall on the following phoneme
    Lengthen phoneme    > (greater)       Lengthen the duration of the following phoneme
    Shorten phoneme     < (less)          Shorten the duration of the following phoneme

Punctuation:                              Pitch effect               Timing effect
    . (period)                            Sentence final fall        Pause follows
    ? (question)                          Sentence final rise        Pause follows
    ! (exclam)                            Sentence final sharp fall  Pause follows
    … (ellipsis)                          Clause final level         Pause follows
    , (comma)                             Continuation rise          Short pause follows
    ; (semicolon)                         Continuation rise          Short pause follows
    : (colon)                             Clause final level         Short pause follows
    ( (parenleft)                         Start reduced range        Short pause precedes
    ) (parenright)                        End reduced range          Short pause follows
    “ ‘ (quotedblleft, quotesingleleft)   Varies                     Varies
    ” ’ (quotedblright, quotesingleright) Varies                     Varies
    - (hyphen)                            Clause-final level         Short pause follows
    & (ampersand)                         Forces no addition of silence between phonemes

Specific pitch contours associated with these punctuation marks may vary according to other considerations in the analysis of the text, such as whether a question is rhetorical or begins with a wh question word, so the above effects should be regarded only as guidelines, not absolutes. This also applies to the timing effects, which will vary according to the current rate setting.

The prosodic control symbols (/, \, <, and >) may be concatenated to provide more exaggerated, cumulative effects. The specific nature of the effect is dependent on the speech synthesizer. Speech synthesizers also often extend or enhance the controls described in this section.
Summary of the Speech Manager

Constants

#define gestaltSpeechAttr 'ttsc'    // Gestalt Manager selector for speech attributes

enum {
    gestaltSpeechMgrPresent = 0     // Gestalt bit that indicates that Speech Manager exists
};

#define kTextToSpeechSynthType       'ttsc' // text-to-speech synthesizer component type
#define kTextToSpeechVoiceType       'ttvd' // text-to-speech voice resource type
#define kTextToSpeechVoiceFileType   'ttvf' // text-to-speech voice file type
#define kTextToSpeechVoiceBundleType 'ttvb' // text-to-speech voice bundle file type

enum {                              // Speech Manager error codes (range from 240 - 259)
    noSynthFound      = -240,
    synthOpenFailed   = -241,
    synthNotReady     = -242,
    bufTooSmall       = -243,
    voiceNotFound     = -244,
    incompatibleVoice = -245,
    badDictFormat     = -246,
    badPhonemeText    = -247
};

enum {                              // constants for SpeakBuffer and text done callback controlFlags bits
    kNoEndingProsody    = 1,
    kNoSpeechInterrupt  = 2,
    kPreflightThenPause = 4
};

enum {                              // constants for StopSpeechAt and PauseSpeechAt
    kImmediate     = 0,
    kEndOfWord     = 1,
    kEndOfSentence = 2
};

// GetSpeechInfo & SetSpeechInfo selectors
#define soStatus             'stat'
#define soErrors             'erro'
#define soInputMode          'inpt'
#define soCharacterMode      'char'
#define soNumberMode         'nmbr'
#define soRate               'rate'
#define soPitchBase          'pbas'
#define soPitchMod           'pmod'
#define soVolume             'volm'
#define soSynthType          'vers'
#define soRecentSync         'sync'
#define soPhonemeSymbols     'phsy'
#define soCurrentVoice       'cvox'
#define soCommandDelimiter   'dlim'
#define soReset              'rset'
#define soCurrentA5          'myA5'
#define soRefCon             'refc'
#define soTextDoneCallBack   'tdcb'
#define soSpeechDoneCallBack 'sdcb'
#define soSyncCallBack       'sycb'
#define soErrorCallBack      'ercb'
#define soPhonemeCallBack    'phcb'
#define soWordCallBack       'wdcb'
#define soSynthExtension     'xtnd'

// speaking mode constants
#define modeText     'TEXT'         // input mode constants
#define modeTX       'TX'
#define modePhonemes 'PHON'
#define modePH       'PH'
#define modeNormal   'NORM'         // character mode and number mode constants
#define modeLiteral  'LTRL'

enum {                              // GetVoiceInfo selectors
    soVoiceDescription = 'info',    // gets basic voice info
    soVoiceFile        = 'fref'     // gets voice file ref info
};

enum { kNeuter = 0, kMale, kFemale };   // returned in gender field below

Data Types

typedef struct SpeechChannelRecord {
    long data[1];
} SpeechChannelRecord;

typedef SpeechChannelRecord *SpeechChannel;

typedef struct VoiceSpec {
    OSType creator;         // creator ID of required synthesizer
    OSType id;              // voice ID on the specified synth
} VoiceSpec;

typedef struct VoiceDescription {
    long      length;       // size of structure - set by application
    VoiceSpec voice;        // voice creator and ID info
    long      version;      // version code for voice
    Str63     name;         // name of voice
    Str255    comment;      // additional text info about voice
    short     gender;       // neuter, male, or female
    short     age;          // approximate age in years
    short     script;       // script code of text voice can process
    short     language;     // language code of voice output
    short     region;       // region code of voice output
    long      reserved[4];  // reserved for future use
} VoiceDescription;

typedef struct VoiceFileInfo {
    FSSpec fileSpec;        // volume, dir, & name information for voice file
    short  resID;           // resource ID of voice in the file
} VoiceFileInfo;

typedef struct SpeechStatusInfo {
    Boolean outputBusy;     // true if audio is playing
    Boolean outputPaused;   // true if channel is paused
    long    inputBytesLeft; // bytes left to process
    short   phonemeCode;    // opcode for cur phoneme
} SpeechStatusInfo;

typedef struct SpeechErrorInfo {
    short count;            // # of errs since last check
    OSErr oldest;           // oldest unread error
    long  oldPos;           // char position of oldest err
    OSErr newest;           // most recent error
    long  newPos;           // char position of newest err
} SpeechErrorInfo;

typedef struct SpeechVersionInfo {
    OSType     synthType;         // always 'ttsc'
    OSType     synthSubType;      // synth flavor
    OSType     synthManufacturer; // synth creator ID
    long       synthFlags;        // synth feature flags
    NumVersion synthVersion;      // synth version number
} SpeechVersionInfo;

typedef struct PhonemeInfo {
    short opcode;           // opcode for the phoneme
    Str15 phStr;            // corresponding char string
    Str31 exampleStr;       // word that shows use of phoneme
    short hiliteStart;      // segment of example word that is
    short hiliteEnd;        //   hilighted text (a la TextEdit)
} PhonemeInfo;

typedef struct PhonemeDescriptor {
    short       phonemeCount;   // # of elements
    PhonemeInfo thePhonemes[1]; // element list
} PhonemeDescriptor;

typedef struct SpeechXtndData {
    OSType synthCreator;    // synth creator ID
    Byte   synthData[2];    // data TBD by synth
} SpeechXtndData;

typedef struct DelimiterInfo {
    Byte startDelimiter[2]; // defaults to [[
    Byte endDelimiter[2];   // defaults to ]]
} DelimiterInfo;

Voice Routines

pascal OSErr MakeVoiceSpec (OSType creator, OSType id, VoiceSpec *voice);
pascal OSErr CountVoices (short *numVoices);
pascal OSErr GetIndVoice (short index, VoiceSpec *voice);
pascal OSErr GetVoiceDescription (VoiceSpec *voice, VoiceDescription *info, long infoLength);
pascal OSErr GetVoiceInfo (VoiceSpec *voice, OSType selector, void *voiceInfo);

Routines for Managing Speech Channels

pascal OSErr NewSpeechChannel (VoiceSpec *voice, SpeechChannel *chan);
pascal OSErr DisposeSpeechChannel (SpeechChannel chan);

Speaking Routines

pascal OSErr SpeakString (StringPtr s);
pascal OSErr SpeakText (SpeechChannel chan, Ptr textBuf, long textBytes);
pascal OSErr StopSpeech (SpeechChannel chan);
pascal OSErr StopSpeechAt (SpeechChannel chan, long whereToStop);
pascal OSErr PauseSpeechAt (SpeechChannel chan, long whereToPause);
pascal OSErr ContinueSpeech (SpeechChannel chan);
pascal OSErr SpeakBuffer (SpeechChannel chan, Ptr textBuf, long textBytes, long controlFlags);

Information and Control Routines

pascal NumVersion SpeechManagerVersion (void);
pascal short SpeechBusy (void);
pascal OSErr SetSpeechRate (SpeechChannel chan, Fixed rate);
pascal OSErr GetSpeechRate (SpeechChannel chan, Fixed *rate);
pascal OSErr SetSpeechPitch (SpeechChannel chan, Fixed pitch);
pascal OSErr GetSpeechPitch (SpeechChannel chan, Fixed *pitch);
pascal short SpeechBusySystemWide (void);
pascal OSErr SetSpeechInfo (SpeechChannel chan, OSType selector, void *speechInfo);
pascal OSErr GetSpeechInfo (SpeechChannel chan, OSType selector, void *speechInfo);

Text-to-Phoneme Conversion Routine

pascal OSErr TextToPhonemes (SpeechChannel chan, Ptr textBuf, long textBytes, Handle phonemeBuf, long *phonemeBytes);

Dictionary Management Routine

pascal OSErr UseDictionary (SpeechChannel chan, Handle dictionary);

Callback Prototypes

// text-done callback routine typedef
typedef pascal void (*TextDoneProcPtr) (SpeechChannel, long, Ptr *, long *, long *);

// speech-done callback routine typedef
typedef pascal void (*SpeechDoneProcPtr) (SpeechChannel, long);

// sync callback routine typedef
typedef pascal void (*SyncProcPtr) (SpeechChannel, long, OSType);

// error callback routine typedef
typedef pascal void (*ErrorProcPtr) (SpeechChannel, long, OSErr, long);

// phoneme callback routine typedef
typedef pascal void (*PhonemeProcPtr) (SpeechChannel, long, short);

// word callback routine typedef
typedef pascal void (*WordProcPtr) (SpeechChannel, long, long, short);

Error Return Codes

noErr                   0  No error
paramErr              –50  Parameter error
memFullErr           –108  Not enough memory to speak
nilHandleErr         –109  Handle argument is nil
siUnknownInfoType    –231  Feature not implemented on synthesizer
noSynthFound         –240  Could not find the specified speech synthesizer
synthOpenFailed      –241  Could not open another speech synthesizer channel
synthNotReady        –242  Speech synthesizer is still busy speaking
bufTooSmall          –243  Output buffer is too small to hold result
voiceNotFound        –244  Voice resource not found
incompatibleVoice    –245  Specified voice cannot be used with synthesizer
badDictFormat        –246  Format problem with pronunciation dictionary
badPhonemeText       –247  Raw phoneme text contains invalid characters
invalidComponentID  –3000  Invalid SpeechChannel parameter